DNN-based uncertainty estimation for weighted DNN-HMM ASR
نویسندگان
چکیده
In this paper, the uncertainty is defined as the mean square error between a given enhanced noisy observation vector and the corresponding clean one. Then, a DNN is trained by using enhanced noisy observation vectors as input and the uncertainty as output with a training database. In testing, the DNN receives an enhanced noisy observation vector and delivers the estimated uncertainty. This uncertainty in employed in combination with a weighted DNN-HMM based speech recognition system and compared with an existing estimation of the noise cancelling uncertainty variance based on an additive noise model. Experiments were carried out with Aurora-4 task. Results with clean, multi-noise and multicondition training are presented.
منابع مشابه
An improved uncertainty decoding scheme with weighted samples for DNN-HMM hybrid systems
In this paper, we advance a recently-proposed uncertainty decoding scheme for DNN-HMM (deep neural network hidden Markov model) hybrid systems. This numerical sampling concept averages DNN outputs produced by a finite set of feature samples (drawn from a probabilistic distortion model) to approximate the posterior likelihoods of the context-dependent HMM states. As main innovation, we propose a...
متن کاملIntegration of DNN based speech enhancement and ASR
Speech enhancement employing Deep Neural Networks (DNNs) is gaining strength as a data-driven alternative to classical Minimum Mean Square Error (MMSE) enhancement approaches. In the past, Observation Uncertainty approaches to integrate MMSE speech enhancement with Automatic Speech Recognition (ASR) have yielded good results as a lightweight alternative for robust ASR. In this paper we thus exp...
متن کاملRobust i-vector based adaptation of DNN acoustic model for speech recognition
In the past, conventional i-vectors based on a Universal Background Model (UBM) have been successfully used as input features to adapt a Deep Neural Network (DNN) Acoustic Model (AM) for Automatic Speech Recognition (ASR). In contrast, this paper introduces Hidden Markov Model (HMM) based ivectors that use HMM state alignment information from an ASR system for estimating i-vectors. Further, we ...
متن کاملImproving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
متن کاملMulti-task learning deep neural networks for speech feature denoising
Traditional automatic speech recognition (ASR) systems usually get a sharp performance drop when noise presents in speech. To make a robust ASR, we introduce a new model using the multi-task learning deep neural networks (MTL-DNN) to solve the speech denoising task in feature level. In this model, the networks are initialized by pre-training restricted Boltzmann machines (RBM) and fine-tuned by...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1705.10368 شماره
صفحات -
تاریخ انتشار 2017